Corpus: mlt_web_2012_300K

4.4.1.5 Number of Word-N-grams at Sentence Endings

Number of distinct word N-grams (N = 1 to 5) at sentence endings, counted over the first K sentences

K        # of words  # of bigrams  # of trigrams  # of 4-grams  # of 5-grams
100              98            99             99            99            99
1000            830           986            997           999           999
10000          6242          9322           9863          9966          9981
100000        34935         80058          94602         98144         98907
1000000       72654        210894         270956        288876        293355
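
Counts of this kind could be reproduced roughly as follows. This is only a minimal sketch, not the code that generated the table: the function name `ending_ngram_counts`, the whitespace tokenization, and the decision to skip sentences shorter than N words are all assumptions made for illustration.

```python
def ending_ngram_counts(sentences, checkpoints, max_n=5):
    """Count distinct sentence-final word N-grams at given sentence counts.

    Hypothetical helper: tokenization is plain whitespace splitting, and a
    sentence with fewer than N tokens contributes no N-gram (both assumed).
    Returns {K: [distinct 1-grams, distinct 2-grams, ..., distinct max_n-grams]}.
    """
    seen = [set() for _ in range(max_n)]  # one set of ending N-grams per N
    wanted = set(checkpoints)
    counts = {}
    for i, sentence in enumerate(sentences, start=1):
        tokens = sentence.split()
        for n in range(1, max_n + 1):
            if len(tokens) >= n:
                # The last n tokens form the sentence-final N-gram.
                seen[n - 1].add(tuple(tokens[-n:]))
        if i in wanted:
            counts[i] = [len(s) for s in seen]
    return counts


if __name__ == "__main__":
    sample = ["the cat sat down", "a dog sat down", "it sat", "go"]
    for k, row in ending_ngram_counts(sample, [2, 4]).items():
        print(k, *row)
```

Each row is cumulative over the first K sentences, so every column can only grow or stay constant as K increases, which matches the shape of the table above.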


Zipf's diagram for sentence endings
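
A Zipf diagram plots frequency against rank, usually on log-log axes. The rank-frequency pairs behind such a plot might be computed as in the sketch below. It assumes the plotted unit is the sentence-final word and that tokens are separated by whitespace; the page does not state either, and `ending_rank_frequency` is a hypothetical name.

```python
from collections import Counter


def ending_rank_frequency(sentences):
    """Return (rank, frequency) pairs for sentence-final words.

    Assumptions: whitespace tokenization; empty sentences are ignored.
    Rank 1 is the most frequent ending word.
    """
    endings = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        if tokens:
            endings[tokens[-1]] += 1
    # most_common() sorts by descending frequency, which defines the rank.
    return [(rank, freq)
            for rank, (_, freq) in enumerate(endings.most_common(), start=1)]


if __name__ == "__main__":
    # Print "rank frequency" lines, a layout a plotting tool can read directly.
    for rank, freq in ending_rank_frequency(["a b", "c b", "d e", "b"]):
        print(rank, freq)
```

The printed two-column output can be fed to a plotting tool with both axes set to logarithmic scale.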



32267 msec needed at 2018-05-25 11:31